Chinese Named Entity Abbreviation Generation Using First-Order Logic

Authors

  • Huan Chen
  • Qi Zhang
  • Jin Qian
  • Xuanjing Huang
Abstract

Normalizing named entity abbreviations to their standard forms is an important preprocessing task for question answering, entity retrieval, event detection, microblog processing, and many other applications. With the rapid expansion of microblogs, this task has received increasing attention in recent years. In this paper, we propose a novel entity abbreviation generation method that uses first-order logic to model long-distance constraints. To reduce the human effort of manually annotating a corpus, we also introduce a method that constructs training data automatically using simple strategies. Experimental results demonstrate that the proposed method achieves better performance than state-of-the-art approaches.

Related resources

Vocabulary expansion through automatic abbreviation generation for Chinese voice search

Long named entities are often abbreviated in spoken Chinese, which usually leads to out-of-vocabulary (OOV) problems in speech recognition applications. The generation of Chinese abbreviations is much more complex than that of English abbreviations, most of which are acronyms and truncations. In this paper, we propose a new method for automatically generating abbreviations for Chinese named en...

Chinese Named Entity Recognition Based on Hierarchical Hybrid Model

Chinese named entity recognition is a challenging yet important task in natural language processing. This paper presents a novel approach based on a hierarchical hybrid model to recognize Chinese named entities. Three mutually dependent stages are integrated in the model: boosting, recognition based on Markov Logic Networks (MLNs), and abbreviation detection. AdaBoost algorithm is utili...

A Preliminary Study on Probabilistic Models for Chinese Abbreviations

Chinese abbreviations are widely used in modern Chinese texts. They are a special form of unknown words, including many named entities. This makes correct Chinese processing difficult. In this study, the Chinese abbreviation problem is regarded as an error recovery problem in which the suspect root words are the “errors” to be recovered from a set of candidates. Such a problem is ...

Chinese NER Using CRFs and Logic for the Fourth SIGHAN Bakeoff

We report a high-performance Chinese NER system that incorporates Conditional Random Fields (CRFs) and first-order logic for the fourth SIGHAN Chinese language processing bakeoff (SIGHAN-6). Using current state-of-the-art CRFs along with a set of well-engineered features for Chinese NER as the base model, we consider distinct linguistic characteristics in Chinese named entities by introducing va...

A Framework Based on Graphical Models with Logic for Chinese Named Entity Recognition

Chinese named entity recognition (NER) has recently been viewed as a classification or sequence labeling problem, and many approaches have been proposed. However, they tend to address this problem without considering linguistic information in Chinese NEs. We propose a new framework based on probabilistic graphical models with first-order logic for Chinese NER. First, we use Conditional Random Fi...

Publication date: 2013